AITopics

Technology: Information Technology > Artificial Intelligence > Vision > Video Understanding (0.73)

Neural Information Processing Systems, Nov-15-2025, 06:22:21 GMT

In most cases, the game designer is expected to first learn about the agents

We would like to thank all reviewers for reading our paper and providing constructive comments. Sometimes, the primary interest is to understand agent behaviors, and hence only the learning mode is needed. Alternatively, when all game inputs are known, the focus is on the intervention mode. In the final version, we will (i) explain in 2.1 how these ... We agree that it is neither rigorous nor necessary to assert that "most" ... Our work is inspired by the current interest in complex optimization-based layers. It is the first to treat VIs as individual layers in the end-to-end framework.

designer, game designer, vi problem, (16 more...)

Industry: Leisure & Entertainment > Games > Computer Games (0.41)

Technology:

Information Technology > Artificial Intelligence > Games > Computer Games (0.41)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.35)

Neural Information Processing Systems, Aug-16-2025, 05:56:48 GMT

c21f4ce780c5c9d774f79841b81fdc6d-AuthorFeedback.pdf

designer, end-to-end framework, vi problem, (15 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.30)

Neural Information Processing Systems, May-26-2025, 21:27:14 GMT

EfficientCAPER: An End-to-End Framework for Fast and Robust Category-Level Articulated Object Pose Estimation

artificial intelligence, efficientcaper, video understanding, (4 more...)

Technology: Information Technology > Artificial Intelligence > Vision > Video Understanding (0.74)

arXiv.org Artificial Intelligence, Mar-31-2025

AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents

Chen, Jiaxiang, Shi, Jingwei, Gan, Lei, Zhang, Jiale, Zhang, Qingyu, Zhang, Dongqian, Pang, Xin, Li, Zhucong, Xu, Yinghui

As AI technology advances, it is driving innovation across industries, increasing the demand for scalable AI project deployment. However, deployment remains a critical challenge due to complex environment configurations, dependency conflicts, cross-platform adaptation, and debugging difficulties, which hinder automation and adoption. This paper introduces AI2Agent, an end-to-end framework that automates AI project deployment through guideline-driven execution, self-adaptive debugging, and case & solution accumulation. AI2Agent dynamically analyzes deployment challenges, learns from past cases, and iteratively refines its approach, significantly reducing human intervention. To evaluate its effectiveness, we conducted experiments on 30 AI deployment cases, covering TTS, text-to-image generation, image editing, and other AI applications. Results show that AI2Agent significantly reduces deployment time and improves success rates. The code and demo video are now publicly accessible.

large language model, machine learning, natural language, (20 more...)

2503.23948

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre:

Workflow (0.70)
Research Report > New Finding (0.35)

Industry: Media (0.36)

Technology:

Information Technology > Artificial Intelligence > Vision (0.91)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)

Lee, Yongjoon, Kim, Chanwoo

Wave-U-Mamba: An End-To-End Framework For High-Quality And Efficient Speech Super Resolution

arXiv.org Artificial Intelligence, Sep-17-2024

Speech Super-Resolution (SSR) is the task of enhancing low-resolution speech signals by restoring missing high-frequency components. Conventional approaches typically reconstruct log-mel features, followed by a vocoder that generates high-resolution speech in the waveform domain. However, as log-mel features lack phase information, this can result in performance degradation during the reconstruction phase. Motivated by recent advances with Selective State Space Models (SSMs), we propose a method, referred to as Wave-U-Mamba, that directly performs SSR in the time domain. In our comparative study, including models such as WSRGlow, NU-Wave 2, and AudioSR, Wave-U-Mamba demonstrates superior performance, achieving the lowest Log-Spectral Distance (LSD) across various low-resolution sampling rates, ranging from 8 kHz to 24 kHz. Additionally, subjective human evaluations, scored using Mean Opinion Score (MOS), reveal that our method produces SSR with natural and human-like quality. Furthermore, Wave-U-Mamba achieves these results while generating high-resolution speech over nine times faster than baseline models on a single A100 GPU, with parameter sizes less than 2% of those in the baseline models.
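The Log-Spectral Distance mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one commonly used definition of LSD (per-frame root-mean-square difference of log10 power spectra, averaged over frames); the paper's exact STFT settings and evaluation code are not given here, and the function name `log_spectral_distance`, its arguments, and the `eps` floor are hypothetical choices made only for illustration.

```python
import math


def log_spectral_distance(ref_mag, est_mag, eps=1e-10):
    """Average per-frame LSD between two magnitude spectrograms.

    ref_mag, est_mag: lists of frames; each frame is a list of non-negative
    magnitudes, one per frequency bin. eps is an assumed numerical floor
    that keeps log10 defined for silent bins.
    """
    if len(ref_mag) != len(est_mag) or not ref_mag:
        raise ValueError("spectrograms must be non-empty with equal frame counts")

    frame_distances = []
    for ref_frame, est_frame in zip(ref_mag, est_mag):
        if len(ref_frame) != len(est_frame) or not ref_frame:
            raise ValueError("frames must be non-empty with equal bin counts")
        # Squared difference of log power spectra, bin by bin.
        squared_diffs = [
            (math.log10(r * r + eps) - math.log10(e * e + eps)) ** 2
            for r, e in zip(ref_frame, est_frame)
        ]
        # Root of the mean over frequency bins gives the distance for this frame.
        frame_distances.append(math.sqrt(sum(squared_diffs) / len(squared_diffs)))

    # Mean over frames gives the utterance-level score (lower is better).
    return sum(frame_distances) / len(frame_distances)
```

Under this definition, identical spectrograms score 0, and an estimate whose magnitudes are uniformly 10 times the reference scores about 2, since the power spectra then differ by two decades in every bin.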

architecture, spectrogram, wave-u-mamba, (15 more...)

2409.09337

Country: Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.58)

Technology:

Information Technology > Artificial Intelligence > Speech (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Bao, Lingfan, Humphreys, Joseph, Peng, Tianhu, Zhou, Chengxu

Deep Reinforcement Learning for Bipedal Locomotion: A Brief Survey

arXiv.org Artificial Intelligence, Apr-25-2024

Bipedal robots are garnering increasing global attention due to their potential applications and advancements in artificial intelligence, particularly in Deep Reinforcement Learning (DRL). While DRL has driven significant progress in bipedal locomotion, developing a comprehensive and unified framework capable of adeptly performing a wide range of tasks remains a challenge. This survey systematically categorizes, compares, and summarizes existing DRL frameworks for bipedal locomotion, organizing them into end-to-end and hierarchical control schemes. End-to-end frameworks are assessed based on their learning approaches, whereas hierarchical frameworks are dissected into layers that utilize either learning-based methods or traditional model-based approaches. This survey provides a detailed analysis of the composition, capabilities, strengths, and limitations of each framework type. Furthermore, we identify critical research gaps and propose future directions aimed at achieving a more integrated and efficient framework for bipedal locomotion, with potential broad applications in everyday life.

international conference, locomotion, robot, (15 more...)

2404.1707

Country:

Europe > United Kingdom > England > West Yorkshire > Leeds (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.93)

Industry:

Health & Medicine (0.68)
Leisure & Entertainment (0.46)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial Intelligence, Apr-16-2024

Learning and Optimization for Price-based Demand Response of Electric Vehicle Charging

Gu, Chengyang, Pan, Yuxin, Liu, Ruohong, Chen, Yize

In the context of charging electric vehicles (EVs), the price-based demand response (PBDR) is becoming increasingly significant for charging load management. Such response usually encourages cost-sensitive customers to adjust their energy demand in response to changes in price for financial incentives. Thus, to model and optimize EV charging, it is important for the charging station operator to model the PBDR patterns of EV customers by precisely predicting charging demands given price signals. The operator then uses these demands to optimize the charging station power allocation policy. The standard pipeline involves offline fitting of a PBDR function based on historical EV charging records, followed by applying estimated EV demands in downstream charging station operation optimization. In this work, we propose a new decision-focused end-to-end framework for PBDR modeling that combines prediction errors and downstream optimization cost errors in the model learning stage. We evaluate the effectiveness of our method on a simulation of charging station operation with synthetic PBDR patterns of EV customers, and experimental results demonstrate that this framework can provide a more reliable prediction model for the ultimate optimization process, leading to more effective optimization solutions in terms of cost savings and charging station operation objectives with only a few training samples.
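The idea of combining prediction error with downstream cost error can be sketched on a toy problem. This is not the authors' formulation: the cost model (`operating_cost`, with its shortfall and surplus prices), the rule that the operator allocates exactly the predicted demand, and the `weight` parameter are all assumptions introduced here only to show why such a loss treats equally sized prediction errors differently.

```python
def operating_cost(allocated, true_demand, shortfall_price=3.0, surplus_price=1.0):
    """Hypothetical station cost: unmet demand costs more than idle capacity."""
    shortfall = max(true_demand - allocated, 0.0)
    surplus = max(allocated - true_demand, 0.0)
    return shortfall_price * shortfall + surplus_price * surplus


def combined_loss(predicted, true, weight=0.5):
    """Mean squared prediction error plus a weighted decision-cost gap.

    The decision here is simply 'allocate the predicted demand' (an assumed
    stand-in for the real optimization). The cost gap compares that decision
    with the one made under perfect knowledge of the true demand.
    """
    if len(predicted) != len(true) or not true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true)
    mse = sum((p - t) ** 2 for p, t in zip(predicted, true)) / n
    cost_gap = sum(
        operating_cost(p, t) - operating_cost(t, t) for p, t in zip(predicted, true)
    ) / n
    return mse + weight * cost_gap


# Same squared error, different downstream consequences.
under = combined_loss([1.0], [2.0])  # under-prediction causes a shortfall
over = combined_loss([3.0], [2.0])   # over-prediction causes a surplus
```

With the assumed prices, under-predicting demand by one unit is penalized more heavily than over-predicting by one unit, even though a pure prediction loss would score both cases the same.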

customer, demand response, ev customer, (14 more...)

2404.10311

Country:

Asia > China > Hong Kong (0.04)
North America > United States > California (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Modeling & Simulation (0.69)

arXiv.org Artificial Intelligence, Feb-9-2023

An End-to-End Framework for Marketing Effectiveness Optimization under Budget Constraint

Yan, Ziang, Wang, Shusen, Zhou, Guorui, Lin, Jingjian, Jiang, Peng

Online platforms often incentivize consumers to improve user engagement and platform revenue. Since different consumers might respond differently to incentives, individual-level budget allocation is an essential task in marketing campaigns. Recent advances in this field often address the budget allocation problem using a two-stage paradigm: the first stage estimates the individual-level treatment effects using causal inference algorithms, and the second stage invokes integer programming techniques to find the optimal budget allocation solution. Since the objectives of these two stages might not be perfectly aligned, such a two-stage paradigm could hurt the overall marketing effectiveness. In this paper, we propose a novel end-to-end framework to directly optimize the business goal under budget constraints. Our core idea is to construct a regularizer to represent the marketing goal and optimize it efficiently using gradient estimation techniques. As such, the obtained models can learn to maximize the marketing goal directly and precisely. We extensively evaluate our proposed method in both offline and online experiments, and experimental results demonstrate that our method outperforms current state-of-the-art methods. Our proposed method is currently deployed to allocate marketing budgets for hundreds of millions of users on a short video platform and achieves significant business goal improvements. Our code will be publicly available.

artificial intelligence, gradient, machine learning, (15 more...)

2302.04477

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Montserrat (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology (0.68)
Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science (0.93)

Jalaboi, Raluca, Faye, Frederik, Orbes-Arteaga, Mauricio, Jørgensen, Dan, Winther, Ole, Galimzianova, Alfiia

DermX: an end-to-end framework for explainable automated dermatological diagnosis

arXiv.org Artificial Intelligence, Oct-3-2022

Dermatological diagnosis automation is essential in addressing the high prevalence of skin diseases and critical shortage of dermatologists. Despite approaching expert-level diagnosis performance, convolutional neural network (ConvNet) adoption in clinical practice is impeded by their limited explainability, and by subjective, expensive explainability validations. We introduce DermX and DermX+, an end-to-end framework for explainable automated dermatological diagnosis. DermX is a clinically-inspired explainable dermatological diagnosis ConvNet, trained using DermXDB, a 554-image dataset annotated by eight dermatologists with diagnoses, supporting explanations, and explanation attention maps. DermX+ extends DermX with guided attention training for explanation attention maps. Both methods achieve near-expert diagnosis performance, with DermX, DermX+, and dermatologist F1 scores of 0.79, 0.79, and 0.87, respectively. We assess the explanation performance in terms of identification and localization by comparing model-selected with dermatologist-selected explanations, and gradient-weighted class-activation maps with dermatologist explanation maps, respectively. DermX obtained an identification F1 score of 0.77, while DermX+ obtained 0.79. The localization F1 score is 0.39 for DermX and 0.35 for DermX+. These results show that explainability does not necessarily come at the expense of predictive power, as our high-performance models provide expert-inspired explanations for their diagnoses without lowering their diagnosis performance.

artificial intelligence, machine learning, natural language, (19 more...)

2202.06956

Country:

North America > United States (0.14)
Europe > Denmark > Capital Region > Kongens Lyngby (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Health & Medicine > Therapeutic Area > Dermatology (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)